Modeling a system by means of Genetic-Fuzzy algorithms


Introduction

The Genetic Fuzzy Modeller (GFM) is a software tool for identifying systems from their I/O data, using Fuzzy Logic and Genetic Algorithms.
GFM is made up of a fuzzy identification system, globally optimized by a Genetic Algorithm.
Below, the structure of the program, the algorithm on which it is based, and an example of its use are briefly reported.

Structure

The Genetic Fuzzy Modeller takes the I/O data of a system as input and, by means of Genetic Algorithms, outputs the Fuzzy Model with a minimized identification error. Systems with multiple inputs and multiple outputs can be handled.
GFM is made up of two main programs: Fuzzy Structure & Model Optimization, and Fuzzy Model Optimization.
The Fuzzy Structure & Model Optimization (F.S.M.O.) is able to determine the best Fuzzy Model structure of the input system, at the same time optimizing the current model.
The Fuzzy Model Optimization (F.M.O.), instead, starts from a pre-established model structure and optimizes only the model. The structure is either specified by the user or previously determined by the F.S.M.O.
Determining the structure of a model means determining which variables are conditioned in the antecedents of the model's fuzzy rules, and how many memberships should cover the universe of discourse of each such variable, so as to minimize the identification error.
All the input variables of the system nevertheless appear, suitably weighted, in the consequents, according to the following rule:

Ri : If x1 is A1 and x2 is A2 and ... and xk is Ak
then y = p0 + p1 x1 + p2 x2 + ... + pk xk ;

This format differs from the general form of a fuzzy rule:

Ri : If F(x1 is A1 , x2 is A2 , ... , xk is Ak)
then y = G (x1 , x2 , ... , xk ) ;

in that it has only "and" connectives in the antecedents and a linear form for the consequents.

Using I/O data of the system to be interpolated, we must determine the following three sets of objects for the identification:

  1. x1, ..., xk, the variables of the implication premises. Note that in the Sugeno format not all the variables necessarily appear in the premises, but they always appear in the consequents, appropriately weighted by the parameters determined as in point 3;
  2. A(x1), ..., A(xk), the membership functions that make up the fuzzy sets in the premises, which we will call premise parameters. The membership functions used are Gaussian, that is to say they have the form G(x) = exp( -(x - m)^2 / (2 v^2) ). They are therefore uniquely determined by the two parameters mean (m) and variance (v);
  3. p0, p1, ..., pk, for each rule, the parameters of the consequents of the rules.
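
The three sets of objects listed above can be put together in a short sketch of how such a model produces an output. The text does not say how the "and" connective is computed or how the rule outputs are combined, so the product of the memberships and the firing-strength weighted average used here are assumptions (a common choice for Sugeno-type models), and the two rules are made-up values for illustration only.

```python
import math


def gaussian(x, m, v):
    """Gaussian membership G(x) = exp(-(x - m)^2 / (2 v^2))."""
    return math.exp(-((x - m) ** 2) / (2.0 * v ** 2))


def rule_output(p, x):
    """Linear consequent y = p0 + p1*x1 + ... + pk*xk."""
    return p[0] + sum(pj * xj for pj, xj in zip(p[1:], x))


def model_output(rules, x):
    """Evaluate a list of rules on the input vector x.

    Each rule is (premise, p): premise maps the index of a conditioned
    variable to its (m, v) pair, p holds the consequent parameters.
    """
    weights, outputs = [], []
    for premise, p in rules:
        w = 1.0
        for idx, (m, v) in premise.items():
            w *= gaussian(x[idx], m, v)  # assumption: product as "and"
        weights.append(w)
        outputs.append(rule_output(p, x))
    # Assumption: weighted average of the rule outputs.
    return sum(w * y for w, y in zip(weights, outputs)) / sum(weights)


# Hypothetical two-rule model: x1 is conditioned, x2 appears only in the consequents.
rules = [
    ({0: (0.0, 1.0)}, [0.0, 1.0, 0.0]),
    ({0: (4.0, 1.0)}, [10.0, 0.0, 1.0]),
]
y = model_output(rules, [2.0, 3.0])
```

At x1 = 2 both rules fire with the same strength, so the output is the plain average of the two consequents.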

Performance Index or Fitness

In the Genetic Fuzzy Modeller, the fitness of a model is the inverse of the sum of the absolute value errors for all the learning patterns.
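
As a minimal sketch, the fitness defined above can be written as follows. The function name and the handling of a zero total error are assumptions made for illustration.

```python
def fitness(measured, predicted):
    """Inverse of the sum of absolute errors over all learning patterns."""
    total_abs_error = sum(abs(y - y_hat) for y, y_hat in zip(measured, predicted))
    if total_abs_error == 0.0:
        return float("inf")  # assumption: a perfect fit gets unbounded fitness
    return 1.0 / total_abs_error


example = fitness([1.0, 2.0, 3.0], [1.5, 2.0, 2.5])  # total error 1.0
```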

Identification Algorithm

The Identification Algorithm is made up of four main phases performed in cascade for each model required: choice of the premise variables, choice of the premise parameters (by means of the genetic algorithm), choice of the consequent parameters (by means of least squares), and evaluation of the current model.

Choice of the premise variables

The variables that appear in the consequents do not always appear in the antecedents: some may be unconditioned. We may then ask: which variables are conditioned in the premises?
This choice implies a subdivision of the universe of discourse associated with such variables, which raises a second question: how many subdivisions should there be? The selection method proposed in the Genetic Fuzzy Modeller is an iterative procedure which determines, among all the input variables, those which prove most significant for identifying the system. The criterion governing this procedure is to minimize the identification error.

Choice of the premise parameters

The optimal premise parameters are sought for the chosen variables. These parameters are the mean and variance values of the Gaussians that make up the fuzzy sets of the current model, and they are optimized by means of a genetic algorithm. The criterion for the final decision regarding the best choice made at each step is again that of error minimization.
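
The text does not describe the encoding or the operators of the genetic algorithm, so the following is only a generic sketch of how (mean, variance) pairs could be searched genetically. The real-valued encoding, tournament selection, uniform crossover, Gaussian mutation, elitism, and the toy fitness are all assumptions, not the GFM implementation.

```python
import math
import random


def gaussian(x, m, v):
    return math.exp(-((x - m) ** 2) / (2.0 * v ** 2))


def evolve(fitness, bounds, pop_size=20, generations=30, mutation_rate=0.2, seed=0):
    """Maximise fitness over real vectors constrained to bounds."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def mutate(ind):
        out = []
        for gene, (lo, hi) in zip(ind, bounds):
            if rng.random() < mutation_rate:
                gene += rng.gauss(0.0, 0.1 * (hi - lo))
            out.append(min(hi, max(lo, gene)))  # keep genes inside the bounds
        return out

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    population = [random_individual() for _ in range(pop_size)]
    history = []  # best fitness of each generation
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        children = [best]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            mother, father = tournament(population), tournament(population)
            child = [m if rng.random() < 0.5 else f for m, f in zip(mother, father)]
            children.append(mutate(child))
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


# Toy problem: recover the (m, v) of one Gaussian from samples of it.
SAMPLES = [(float(x), gaussian(float(x), 2.0, 1.5)) for x in range(-3, 8)]


def toy_fitness(ind):
    m, v = ind
    error = sum(abs(y - gaussian(x, m, v)) for x, y in SAMPLES)
    return 1.0 / (error + 1e-12)  # small guard against division by zero


best, history = evolve(toy_fitness, bounds=[(-5.0, 10.0), (0.1, 5.0)])
```

Because the best individual is carried over unchanged, the best fitness never decreases from one generation to the next.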

Choice of the consequent parameters

The consequent parameters are calculated with the least squares method, with regard to the choices made during the first two phases.
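
Once the premise parameters are fixed, the model output is linear in the consequent parameters, which is what makes a least squares solution possible. The sketch below assumes the weighted-average form of the output and solves the normal equations with plain Gaussian elimination; the function names are hypothetical, and a production solver would normally use a better-conditioned method.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def least_squares(design, targets):
    """Minimise the squared error of design * p against targets."""
    cols = len(design[0])
    ata = [[sum(row[i] * row[j] for row in design) for j in range(cols)]
           for i in range(cols)]
    atb = [sum(row[i] * t for row, t in zip(design, targets)) for i in range(cols)]
    return solve_linear(ata, atb)


def design_row(weights, x):
    """Regressors for one pattern: for each rule, its normalised firing
    strength multiplied by [1, x1, ..., xk]."""
    total = sum(weights)
    row = []
    for w in weights:
        beta = w / total
        row.extend([beta] + [beta * xj for xj in x])
    return row


# Toy check with a single rule: the data follow y = 2 + 3 x exactly.
xs = [0.0, 1.0, 2.0, 3.0]
design = [design_row([1.0], [x]) for x in xs]
targets = [2.0 + 3.0 * x for x in xs]
params = least_squares(design, targets)  # approximately [2.0, 3.0]
```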

Evaluation of the current model

How to use the Genetic Fuzzy Modeller

To obtain a fuzzy model of the system for which we know only the I/O data, we proceed in the following way:
The system input data (patterns) are subdivided into two files, one of learning and one of testing.
The generic entry of the patterns file has the following shape:

[input1 input2 ... inputn output1 output2 ... outputm]

The Fuzzy Structure & Model Optimization program is run, specifying the parameters required.
If the fuzzy model structure is already known, the Fuzzy Model Optimization program can be run instead, which skips the first phase of the identification algorithm.
Then the learning of the fuzzy model is performed (position and shape of the fuzzy sets, which is to say the mean and variance of the Gaussians) using the system's learning patterns.
The universe of discourse of every variable is determined automatically according to the values read from the learning file.
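
The pattern format and the automatic choice of the universe of discourse can be illustrated as follows. This is a sketch under assumptions: values are taken to be whitespace-separated, the square brackets are treated as optional, and the universe of each variable is taken to be the [min, max] range observed in the learning patterns, which the text does not state explicitly. The sample lines are invented.

```python
def parse_pattern(line, n_inputs):
    """Split one pattern entry into (inputs, outputs)."""
    values = [float(tok) for tok in line.strip().strip("[]").split()]
    return values[:n_inputs], values[n_inputs:]


def universe_of_discourse(patterns):
    """Return the (min, max) range of every input variable."""
    columns = zip(*(inputs for inputs, _ in patterns))
    return [(min(col), max(col)) for col in columns]


# Hypothetical learning patterns: three inputs and one output per entry.
learning_lines = [
    "[1200 12.5 20 68.3]",
    "[800 10 18 64.1]",
    "[950 11 19.5 66.0]",
]
learning = [parse_pattern(line, n_inputs=3) for line in learning_lines]
universe = universe_of_discourse(learning)
```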
After performing the identification algorithm, the program obtains a fuzzy model of the system with details of all the parameters, which is then tested on the test patterns. The resulting model is contained in the file agf.m.
By using these files, it is therefore possible to test the model in the Matlab environment with a program that is part of the software. The characteristic parameters of the system are:
a vector mpv, which contains the mean and variance of the Gaussians, starting from the first membership on the first conditioned variable;
a parameter matrix of the memb consequents for each output of the system;
a matrix sout_i of the consequent parameters for each output of the system.

Examples of application

The example reports the interpolation of data characterizing the acoustic pollution level as a function of the average number of vehicles passing along a street, the width of the street, and the height of the buildings flanking it.
The acoustic measurements were made on streets belonging to various categories of urban areas: residential, commercial, and industrial.
The acoustic pollution meter was set so as to acquire 100 samples per second of the noise sound intensity Li (dBA).
The equivalent sound level was determined according to the following relation:

LAeqT = 10 log( (1/T) sum_i 10^(Li/10) )

where T = 3600s and Li is the sound level measured in one second.
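
As a worked illustration of the relation above, the following sketch computes the equivalent level from a list of one-second levels. It assumes a base-10 logarithm, as is usual for decibel quantities, and one Li value per second of the period T; the function name is hypothetical.

```python
import math


def equivalent_level(levels_dba, period_s=3600.0):
    """LAeqT = 10 log10( (1/T) * sum_i 10^(Li/10) )."""
    energy = sum(10.0 ** (li / 10.0) for li in levels_dba)
    return 10.0 * math.log10(energy / period_s)


# A constant 70 dBA over the whole hour gives an equivalent level of 70 dBA.
constant_hour = equivalent_level([70.0] * 3600)
```

Because the average is taken over energies rather than over decibel values, short loud events raise the equivalent level more than a plain mean of the Li would suggest.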
The parameters to take into account when constructing a very accurate model are:

A model constructed in this way has a non-linear structure and proves hard to identify. We therefore preferred to consider a reduced set of parameters. In particular, the following were considered for constructing the model: the equivalent number of vehicles (neq), the height of the buildings (h), and the width of the street (w).
The equivalent number of vehicles was defined as follows:

neq = ncars + c1 nmc + c2 nhv

where:
ncars is the number of automobiles (1/h);
nmc is the number of motorcycles (1/h);
nhv is the number of heavy vehicles (1/h).
The values c1 = 3 and c2 = 6 are two coefficients that convert the number of motorcycles and heavy vehicles into the equivalent number of automobiles. The model thus assumes the following expression:

LAeqT = f( neq, h, w)
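
The equivalent number of vehicles that enters the model as neq can be computed directly from the three hourly counts. The function name and the sample counts below are illustrative only; the coefficients are the ones given in the text.

```python
def equivalent_vehicles(n_cars, n_mc, n_hv, c1=3.0, c2=6.0):
    """neq = ncars + c1 * nmc + c2 * nhv, all counts in vehicles per hour."""
    return n_cars + c1 * n_mc + c2 * n_hv


# Hypothetical counts: 1000 cars, 50 motorcycles, 20 heavy vehicles per hour.
n_eq = equivalent_vehicles(1000, 50, 20)
```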

At each measurement station, a video camera was used to count, off-line, the number of vehicles passing.
Figures 1-4 report the parameters measured and used by the F.S.M.O. in the learning phase.
For each figure, the relative times of the above-mentioned sample measurements are shown on the abscissa, and the measurement units of the variable considered are given on the ordinate.
The model was identified by assigning 4 memberships to the first and second variables, whereas the third was left unconditioned and is present only in the consequents.
The parameters characterizing the fuzzy rule membership functions that describe such a system are:

MEAN         VARIANCE     VARIABLE
-----------  -----------  --------
8476.430446  2582.942575  VAR1
5429.988403  2277.304193  VAR1
1052.769578  1512.498114  VAR1
1204.263200  674.549020   VAR1
13.937038    4.522798     VAR2
30.925380    6.046329     VAR2
26.436581    5.801685     VAR2
17.263657    2.289874     VAR2

The consequent parameters are written by the F.S.M.O. to the file in such a way that each line holds the parameters of one rule of the optimum model.

p0           p1      p2         p3
-----------  ------  ---------  ---------
-124081.69   -4.39   6627.21    887.09
-154365.78   -19.75  10088.34   1291.26
101520.21    13.48   -8234.34   -826.70
-6465.90     4.20    338.89     -629.68
-110122.35   7.19    6927.98    -1102.32
1735498.52   1.81    -33420.57  -23588.96
556228.97    6.54    -10153.33  -9115.72
158025.05    -5.19   -12558.84  2092.47
-18260.62    0.94    965.66     -199.88
-3138716.97  4.26    60175.60   42511.54
-23287.71    -0.38   -608.58    1468.95
-15850.08    2.59    -415.28    530.47
5230.47      -0.18   -538.08    336.09
-12840.95    3.83    -158.62    208.65
-3864.33     1.80    58.73      -58.16
-21037.93    0.94    1498.03    -204.97

Figure 5 reports the result of the learning phase, i.e., the comparison between the measured signal and the identified model output. The result of the test phase, in which a set of unknown inputs was presented to the F.S.M.O., is reported in Fig. 6.

The error over 90 patterns is 41.700616, the mean error for the patterns is 0.463340.
The error over 10 patterns is 5.269958, the mean error for the patterns is 0.526996.
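
The reported figures are consistent with the mean error being the total error divided by the number of patterns, as the short check below shows. The function name is hypothetical.

```python
def mean_error(total_error, n_patterns):
    """Mean error per pattern."""
    return total_error / n_patterns


over_90 = mean_error(41.700616, 90)
over_10 = mean_error(5.269958, 10)
print(f"{over_90:.6f} {over_10:.6f}")
```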